Phrase-Based Transliteration with Simple Heuristics
Abstract
This paper models transliteration as a phrase-based machine translation problem. We used a popular phrase-based machine translation system for English-Hindi machine transliteration and achieved an accuracy of 38.1% on the test set. We applied some basic rules to adapt the existing phrase-based system to transliteration. Our experiments show that phrase-based machine translation systems can be adopted for transliteration, provided the system is adapted to fit the problem.
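A common way to cast transliteration as phrase-based translation is to treat each character as a "word", so that the phrase table learns character-sequence mappings. The abstract does not say how the data was prepared, so the sketch below is only an illustration under that assumption; the function names and the sample name pair are hypothetical and are not taken from the paper.

```python
# Illustrative sketch (an assumption, not the paper's confirmed pipeline):
# rewrite single-word name pairs as space-separated character sequences so
# that a phrase-based MT toolkit can align and translate character "phrases".

def to_char_tokens(word: str) -> str:
    """Split a single word into space-separated Unicode code points."""
    return " ".join(word.strip())


def from_char_tokens(tokens: str) -> str:
    """Join decoder output (space-separated characters) back into a word."""
    return "".join(tokens.split())


def build_parallel_corpus(pairs):
    """Turn (source, target) name pairs into two parallel lists of lines."""
    src_lines = [to_char_tokens(src.lower()) for src, _ in pairs]
    tgt_lines = [to_char_tokens(tgt) for _, tgt in pairs]
    return src_lines, tgt_lines


if __name__ == "__main__":
    # Hypothetical sample pair, used only to show the data format.
    src, tgt = build_parallel_corpus([("Delhi", "दिल्ली")])
    print(src[0])
    print(from_char_tokens(tgt[0]))
```

The two lists would be written to the source and target files of the training corpus, and the decoder's character output would be rejoined with `from_char_tokens`. The sketch assumes single-word names; multi-word names would need an explicit word-boundary symbol.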
Similar resources
A Bayesian model of bilingual segmentation for transliteration
In this paper we propose a novel Bayesian model for unsupervised bilingual character sequence segmentation of corpora for transliteration. The system is based on a Dirichlet process model trained using Bayesian inference through blocked Gibbs sampling implemented using an efficient forward filtering/backward sampling dynamic programming algorithm. The Bayesian approach is able to overcome the o...
Syllable-based Machine Transliteration with Extra Phrase Features
This paper describes our syllable-based phrase transliteration system for the NEWS 2012 shared task on English-Chinese track and its back. Grapheme-based Transliteration maps the character(s) in the source side to the target character(s) directly. However, character-based segmentation on English side will cause ambiguity in alignment step. In this paper we utilize Phrase-based model to solve ma...
The Application of Bayesian Alignment Techniques to Transliteration Generation and Mining
Bayesian techniques have recently been applied to many areas of natural language processing, and have proven themselves particularly useful in areas involving segmentation and alignment. This paper looks at the direct application of these techniques to the co-segmentation/alignment of grapheme sequences. We detail a novel Bayesian model for unsupervised bilingual character sequence alignment of...
Bilingual Dictionary Construction with Transliteration Filtering
In this paper we present a bilingual transliteration lexicon of 170K Japanese-English technical terms in the scientific domain. Translation pairs are extracted by filtering a large list of transliteration candidates generated automatically from a phrase table trained on parallel corpora. Filtering uses a novel transliteration similarity measure based on a discriminative phrase-based machine tra...
A Semantics-based English-Bengali EBMT System for translating News Headlines
The paper reports an Example based Machine Translation System for translating News Headlines from English to Bengali. The input headline is initially searched in the Direct Example Base. If it cannot be found, the input headline is tagged and the tagged headline is searched in the Generalized Tagged Example Base. If a match is obtained, the tagged headline in Bengali is retrieved from the examp...
Publication date: 2010